Layered TPOT: Speeding up Tree-based Pipeline Optimization

Authors

  • Pieter Gijsbers
  • Joaquin Vanschoren
  • Randal S. Olson
Abstract

As the demand for machine learning increases, so does the demand for tools that make it easier to use. Automated machine learning (AutoML) tools have been developed to address this need, such as the Tree-Based Pipeline Optimization Tool (TPOT), which uses genetic programming to build optimal pipelines. We introduce Layered TPOT, a modification to TPOT that aims to create pipelines as good as the original's, but in significantly less time. This approach evaluates candidate pipelines on increasingly large subsets of the data according to their fitness, using a modified evolutionary algorithm to allow for separate competition between pipelines trained on different sample sizes. Empirical evaluation shows that, on sufficiently large datasets, Layered TPOT indeed finds better models faster.
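The layered idea in the abstract can be illustrated with a small sketch. This is not the Layered TPOT implementation and does not use the TPOT API or genetic programming. It is a simplified stand-in in which the "pipelines" are toy threshold classifiers, each layer scores its candidates on a larger subset of the data, and only the best candidates of a layer move up. The function names, the halving schedule for subset sizes, and the `keep` fraction are all assumptions made for illustration.

```python
def accuracy(threshold, sample):
    """Fraction of (x, label) pairs that the rule `x > threshold` gets right."""
    return sum((x > threshold) == label for x, label in sample) / len(sample)


def layered_search(data, candidates, n_layers=3, keep=0.5):
    """Score candidates on growing subsets; only the best of each layer move up.

    Hypothetical helper: the subset schedule (n/4, n/2, n for three layers)
    and the `keep` fraction are illustrative choices, not values from the paper.
    """
    survivors = list(candidates)
    for layer in range(n_layers):
        # Layer 0 sees the smallest subset; the last layer sees all the data.
        size = max(1, len(data) // 2 ** (n_layers - 1 - layer))
        subset = data[:size]
        # Candidates compete only against others scored on the same subset.
        ranked = sorted(survivors, key=lambda t: accuracy(t, subset), reverse=True)
        if layer < n_layers - 1:
            survivors = ranked[:max(1, int(len(ranked) * keep))]
        else:
            survivors = ranked
    return survivors[0]


# Toy data: 400 points spread over [0, 1), interleaved so that every prefix
# is roughly representative; the true decision boundary is x > 0.6.
xs = [(i * 37 % 400) / 400 for i in range(400)]
data = [(x, x > 0.6) for x in xs]

# Toy "pipelines": thresholds 0.0, 0.05, ..., 0.95.
candidates = [i / 20 for i in range(20)]
best = layered_search(data, candidates)
print(best, accuracy(best, data))
```

In this sketch the 20 candidates are scored on 100 points, the best 10 on 200 points, and the best 5 on all 400, so most of the evaluation cost is spent on candidates that already looked promising on cheaper subsets.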

Similar resources

Automating Biomedical Data Science Through Tree-Based Pipeline Optimization

Over the past decade, data science and machine learning have grown from a mysterious art form to a staple tool across a variety of fields in academia, business, and government. In this paper, we introduce the concept of tree-based pipeline optimization for automating one of the most tedious parts of machine learning: pipeline design. We implement a Tree-based Pipeline Optimization Tool (TPOT) and...

TPOT: A Tree-based Pipeline Optimization Tool for Automating Machine Learning

As data science becomes more mainstream, there will be an ever-growing demand for data science tools that are more accessible, flexible, and scalable. In response to this demand, automated machine learning (AutoML) researchers have begun building systems that automate the process of designing and optimizing machine learning pipelines. In this paper we present TPOT v0.3, an open source genetic p...

An Improved Optimization Model for Scheduling of a Multi-Product Tree-Like Pipeline

In the petroleum supply chain, oil refined products are often delivered to distribution centers by pipelines since they provide the most reliable and economical mode of transportation over large distances. This paper addresses the optimal scheduling of a complex pipeline network with multiple branching lines. The main challenge is to find the optimal sequence and time of product injections/deli...

Towards a more efficient representation of imputation operators in TPOT

Automated Machine Learning encompasses a set of meta-algorithms intended to design and apply machine learning techniques (e.g., model selection, hyperparameter tuning, model assessment, etc.). TPOT, a software package for optimizing machine learning pipelines based on genetic programming (GP), is a novel example of this kind of application. Recently we have proposed a way to introduce imputation metho...

Speeding-up Mathematical Morphology Computations with Special-Purpose Array Processors

The first part of this paper will analyze the computational complexity of the implementation of Mathematical Morphology operations on three different architectures: general-purpose serial systems, pipeline systems, and cellular systems. For each considered architecture, a different computing technique is devised, exploiting the specific system characteristics, and obviously reaching different throu...

Publication date: 2017